Is Natural Language a Perigraphic Process? The Theorem about Facts and Words Revisited

Abstract

As we discuss, a stationary stochastic process is nonergodic when a random persistent topic can be detected in the infinite random text sampled from the process, whereas we call the process strongly nonergodic when an infinite sequence of independent random bits, called probabilistic facts, is needed to describe this topic completely. Replacing probabilistic facts with an algorithmically random sequence of bits, called algorithmic facts, we adapt this property back to ergodic processes. Subsequently, we call a process perigraphic if the number of algorithmic facts which can be inferred from a finite text sampled from the process grows like a power of the text length. We present a simple example of such a process. Moreover, we demonstrate an assertion which we call the theorem about facts and words. This proposition states that the number of probabilistic or algorithmic facts which can be inferred from a text drawn from a process must be roughly smaller than the number of distinct word-like strings detected in this text by means of the PPM compression algorithm. We also observe that the number of the word-like strings for a sample of plays by Shakespeare follows an empirical stepwise power law, in a stark contrast to Markov processes. Hence we suppose that natural language considered as a process is not only non-Markov but also perigraphic.
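
As a rough illustration of what "grows like a power of the text length" means for a count of distinct strings, the sketch below counts distinct fixed-length substrings in growing prefixes of a text and fits a slope on a log-log scale. This is a simplified stand-in and not the PPM-based detection of word-like strings that the abstract refers to. The fixed substring length `k`, the halving scheme for prefix lengths, and the function names `distinct_substrings` and `growth_exponent` are all assumptions made for this example.

```python
import math


def distinct_substrings(text: str, k: int) -> int:
    """Count distinct length-k substrings of text (a crude vocabulary proxy)."""
    return len({text[i:i + k] for i in range(len(text) - k + 1)})


def growth_exponent(text: str, k: int, num_points: int = 8) -> float:
    """Least-squares slope of log(count) versus log(prefix length)."""
    # Prefix lengths n, n/2, n/4, ... that are long enough to hold a k-gram.
    lengths = [len(text) >> j for j in range(num_points)]
    lengths = [n for n in lengths if n > k]
    if len(lengths) < 2:
        raise ValueError("text too short to fit a slope")
    xs = [math.log(n) for n in lengths]
    ys = [math.log(distinct_substrings(text[:n], k)) for n in lengths]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var
```

In this toy measure, a strictly periodic text yields a slope near 0, because the count saturates, while a text whose characters never repeat yields a slope near 1 for `k = 1`. A slope strictly between these extremes would be the kind of power-law growth the abstract describes, although the abstract's actual measurement relies on PPM rather than a fixed `k`.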

Bibliographic record

  • Author

    Dębowski, Łukasz

  • Year: 2017
  • Original format: PDF
